Statistics using R (spring of 2017) Computer lab: Multivariate Plots, Principal Component Analysis and Discriminant Analysis
Abstract
Consider a data matrix X of real numbers with n rows and k columns. We view the rows X1, ..., Xn as independent samples from a multivariate distribution with mean vector μ and covariance matrix Σ. Can we visualize X in some easy way?
One possibility is to use heat plots (also known as heat maps). These are constructed as follows: think of a rectangular matrix in which the column-standardized (or row-standardized) observations are replaced by colored squares on a continuous color scale, e.g. intense green for very negative values, intense red for very positive values, and some suitable interpolation for the values in between. By sorting both rows and columns in various ways, one can get a convenient visual overview of all the vectors. Observe that such sorting only permutes rows and columns, so the data structure and information are kept fully intact. In genomics such plots were introduced by Eisen et al. (1998), and in that field they are often called Eisen plots after him. An interesting historical sketch is given by Wilkinson and Friendly (2009).
One option for the sorting is to use hierarchical clustering and display the result as trees, in which similar observation vectors are placed near each other in the row ordering and highly correlated column patterns are likewise sorted together. There are many options available for hierarchical clustering. It can be performed sequentially from the root of a tree (divisive methods), or it can start by merging the leaves of the tree (agglomerative methods). The measures of similarity or distance between objects, and between the groups of objects being clustered, can be varied in many different ways. A quite pedagogical account of traditional clustering terminology and ideas is given by Jain et al. (1999).
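The construction described in the abstract can be sketched in a few lines. The following is only an illustration in Python using the standard library, not the lab's own R code. All function names, the toy matrix, the black midpoint of the color scale, the clipping limit of 2 standard deviations, and the choice of average linkage with Euclidean distance are assumptions made for the example.

```python
import math

def standardize_columns(X):
    """Return a copy of X with each column centered to mean 0 and scaled to sample sd 1."""
    n, k = len(X), len(X[0])
    Z = [[0.0] * k for _ in range(n)]
    for j in range(k):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        for i in range(n):
            Z[i][j] = (col[i] - mean) / sd if sd > 0 else 0.0
    return Z

def z_to_rgb(z, limit=2.0):
    """Map a z-score to (r, g, b): green if negative, black at 0, red if positive.
    Values beyond +/- limit are clipped (the limit is an assumed choice)."""
    t = max(-1.0, min(1.0, z / limit))
    if t >= 0:
        return (int(round(255 * t)), 0, 0)
    return (0, int(round(255 * -t)), 0)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative_order(vectors):
    """Average-linkage agglomerative clustering on Euclidean distance.
    Returns the leaf order obtained by concatenating clusters as they merge."""
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        a, b = min(pairs, key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters[0]

# Toy data (hypothetical): rows 0 and 2 are similar, as are rows 1 and 3.
X = [[1.0, 10.0, 5.0],
     [9.0, 2.0, 5.5],
     [1.2, 9.5, 4.0],
     [8.5, 2.5, 6.0]]

Z = standardize_columns(X)
row_order = agglomerative_order(Z)
col_order = agglomerative_order([list(c) for c in zip(*Z)])

# The heat plot is the permuted matrix of colors; no value is altered by sorting.
heat = [[z_to_rgb(Z[i][j]) for j in col_order] for i in row_order]
```

For columns standardized with the sample standard deviation, the squared Euclidean distance between two columns equals 2(n - 1)(1 - r), where r is their correlation, so clustering the standardized columns by Euclidean distance places highly correlated columns together. Rendering `heat` as colored squares, and drawing the trees, is left to a plotting library.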
Similar resources
Feature reduction of hyperspectral images: Discriminant analysis and the first principal component
When the number of training samples is limited, feature reduction plays an important role in classification of hyperspectral images. In this paper, we propose a supervised feature extraction method based on discriminant analysis (DA) which uses the first principal component (PC1) to weight the scatter matrices. The proposed method, called DA-PC1, copes with the small sample size problem and has...
The NPAIRS Computational Statistics Framework for Data Analysis in Neuroimaging
We introduce the role of resampling and prediction (p) metrics for flexible discriminant modeling in neuroimaging, and highlight the importance of combining these with measurements of the reproducibility (r) of extracted brain activation patterns. Using the NPAIRS resampling framework we illustrate the use of (p, r) plots as a function of the size of the principal component subspace (Q) for a p...
Investors' Perception of Bank Risk Management: Multivariate Analysis Techniques
According to the nature of their activities, banks are exposed to various types of risks. Hence, risk management is at the heart of financial institution management. In this study, we intend to summarize the information content of bank financial statements on diverse risks faced by banks and then determine how stock markets react to banks' risk management behavior. The methodology used in this...
Discrimination of Golab apple storage time using acoustic impulse response and LDA and QDA discriminant analysis techniques
Firmness is one of the most important quality indicators for apple fruits, which is highly correlated with the storage time. The acoustic impulse response technique is one of the most commonly used nondestructive detection methods for evaluating apple firmness. This paper presents a non-destructive method for classification of Iranian apple (Malus domestica Borkh. cv. Golab) according...
Detection and Classification of Bacteria using Raman Spectroscopy Combined with Multivariate Analysis
Vibrational spectroscopic techniques have advantages over conventional microbiological approaches for the identification and detection of pathogens. Since a unique spectral fingerprint is obtained, one can identify very closely related bacteria using such methods. In this study, Raman microspectroscopy in combination with a chemometric method has been used to classify four strains of E. coli (two path...
Publication date: 2017